
United States Patent and Trademark Office 



UNITED STATES DEPARTMENT OF COMMERCE 
United States Patent and Trademark Office 
Address: COMMISSIONER FOR PATENTS 
P.O. Box 1450 

Alexandria, Virginia 22313-1450 
www.uspto.gov 



APPLICATION NO.: 10/812,561
FILING DATE: 03/30/2004
FIRST NAMED INVENTOR: Lei Duan
ATTORNEY DOCKET NO.: 50T5601.01/1696
CONFIRMATION NO.: 3671



24272 7590 

Gregory J. Koerner 
Redwood Patent Law 
1291 East Hillsdale Boulevard 
Suite 205 

Foster City, CA 94404 



10/09/2007 



EXAMINER: HERNANDEZ, JOSIAH J
ART UNIT: 2626
PAPER NUMBER:
MAIL DATE: 10/09/2007
DELIVERY MODE: PAPER



Please find below and/or attached an Office communication concerning this application or proceeding.

The time period for reply, if any, is set in the attached communication. 



PTOL-90A (Rev. 04/07) 




Office Action Summary 


Application No.: 10/812,561
Applicant(s): DUAN ET AL.
Examiner: Josiah Hernandez
Art Unit: 2626





-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --

Period for Reply 



A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) OR THIRTY (30) DAYS, 
WHICHEVER IS LONGER, FROM THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication.

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication.

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b).

Status 

1) [X] Responsive to communication(s) filed on 30 March 2004.
2a) [ ] This action is FINAL. 2b) [X] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 1-42 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) [ ] Claim(s) is/are allowed.

6) [X] Claim(s) 1-42 is/are rejected.

7) [ ] Claim(s) is/are objected to.

8) [ ] Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [X] The drawing(s) filed on 30 March 2004 is/are: a) [X] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).
Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) [ ] All b) [ ] Some * c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. .

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage

application from the International Bureau (PCT Rule 17.2(a)). 
* See the attached detailed Office action for a list of the certified copies not received. 



Attachment(s) 

1) [X] Notice of References Cited (PTO-892)

2) [ ] Notice of Draftsperson's Patent Drawing Review (PTO-948)

3) [X] Information Disclosure Statement(s) (PTO/SB/08)
Paper No(s)/Mail Date 06/10/2004.

4) [ ] Interview Summary (PTO-413)
Paper No(s)/Mail Date.

5) [ ] Notice of Informal Patent Application

6) [ ] Other:



U.S. Patent and Trademark Office 

PTOL-326 (Rev. 08-06) 



Office Action Summary 



Part of Paper No./Mail Date 20071004



Application/Control Number: 10/812,561 
Art Unit: 2626 



Page 2 



DETAILED ACTION 



Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 1, 3, 5, 7-11, 16, 21, 23, 25, 27-31, 36, 41, and 42 are rejected under 35
U.S.C. 103(a) as being unpatentable over Gao et al. (US 7,275,029) in view of 
Bellegarda (US 6,778,952). 

As to claims 1, 21, 41, and 42, Gao discloses a system for optimizing
speech recognition procedures (abstract lines 1-2), comprising: initial language
models each created by combining source models (column 14 lines 35-40)
according to interpolation coefficients that define proportional relationships for
combining said source models (column 14 lines 40-45); a speech recognizer
(column 5 lines 26-29) that utilizes said initial language models to process input
development data (probability is calculated from each of the models to create an
optimum combination, column 14 lines 22-40) for calculating probabilities that each
correspond to a different one of said initial language models (column 14 lines
38-43); and an optimized language model selected from said initial language
models by identifying an optimal probability from among said probabilities
(column 14 lines 35-45), said speech recognizer utilizing said optimized language
model for performing said speech recognition procedures (column 5 lines 26-29).

Gao does not disclose specifically using or calculating word-error rates. 
Bellegarda teaches calculating and combining the probabilities of the language
models (abstract lines 1-3) in order to reduce in an optimal form the word-error 
rate (column 1 lines 38-45). 

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
word-error rates as taught by Bellegarda. Doing so would have allowed
optimizing the language model by using the word-error rate as an indicator of how
much to rely on a particular language model.

As to claims 3 and 23, Gao discloses the system of claim 1 wherein said 
initial language models are implemented as statistical language models that 
include N-grams and probability values that each correspond to one of said N- 
grams (column 8 lines 15-23). 

As to claims 5 and 25, Gao does not disclose specifically the system of 
claim 1 wherein said source models are each similarly implemented as statistical 
language models that include N-grams and probability values that each
correspond to one of said N-grams. Bellegarda teaches using statistical models
(column 3 lines 1-8) that include N-grams and probability values (column 4 lines 
38-48). 

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
N-grams and probability values as taught by Bellegarda. It is commonly known 
in the art that N-grams and probability values are used in statistical models.

As to claims 7 and 27, Gao discloses the system of claim 1 wherein sets 
of said interpolation coefficients are each associated with a different one of said 
source models to define how much said different one of said source models 
contributes to a corresponding one of said initial language models (the 
interpolation coefficients are combined in weighted amounts, column 14 lines 40- 
45). 

As to claims 8 and 28, Gao discloses the system of claim 1 wherein said
interpolation coefficients are each multiplied with a different one of said source
models to produce a series of weighted source models that are then combined to 
produce a corresponding one of said initial language models (the interpolation 
coefficients are combined in weighted amounts, column 14 lines 40-45). 



As to claims 9 and 29, Gao discloses the system of claim 1 wherein said
initial language models are each calculated by a formula:
$LM = \lambda_1 SM_1 + \lambda_2 SM_2 + \cdots + \lambda_n SM_n$
where said LM is one of said initial language models,
said $SM_1$ is a first one of said source models, said $SM_n$ is a final one of
said source models in a continuous sequence of "n" source models, and said
$\lambda_1$, said $\lambda_2$, and said $\lambda_n$ are said
interpolation coefficients applied to respective probability values of said source
models to weight how much each of said source models contributes to said one
of said initial language models (a formula is used to calculate by a percentage
weight the amount of each language model and it is determined by the
probabilities of optimizing the accuracy of recognition, column 14 lines 40-45).
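
The weighted combination recited in claims 9 and 29 can be sketched as follows. This is a minimal Python illustration only, assuming each source model is a plain mapping from an N-gram string to a probability; the function name interpolate_models and the toy data are hypothetical and are not taken from Gao or from the application. The range and sum checks reflect the coefficient limits recited in claims 10 and 30.

```python
from typing import Dict, Sequence


def interpolate_models(source_models: Sequence[Dict[str, float]],
                       coefficients: Sequence[float]) -> Dict[str, float]:
    """Combine source-model probabilities as LM = sum_i lambda_i * SM_i."""
    if len(source_models) != len(coefficients):
        raise ValueError("need exactly one coefficient per source model")
    if any(c < 0.0 or c > 1.0 for c in coefficients):
        raise ValueError("each coefficient must lie between 0 and 1")
    if abs(sum(coefficients) - 1.0) > 1e-9:
        raise ValueError("coefficients must sum to 1")

    combined: Dict[str, float] = {}
    for model, weight in zip(source_models, coefficients):
        for ngram, prob in model.items():
            # Each source model contributes its probability scaled by its weight.
            combined[ngram] = combined.get(ngram, 0.0) + weight * prob
    return combined


# Hypothetical toy source models (assumed data, for illustration only).
sm1 = {"the cat": 0.5, "the dog": 0.5}
sm2 = {"the cat": 0.25, "the dog": 0.75}
lm = interpolate_models([sm1, sm2], [0.5, 0.5])
```

With equal weights, each N-gram in the combined model receives the average of its two source probabilities.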

As to claims 10 and 30, Gao does not disclose specifically the system of
claim 1 wherein said interpolation coefficients are each greater than or equal to
"0", and are also each less than or equal to "1", a sum of all of said interpolation
coefficients being equal to "1". It is inherent to one having ordinary skill in the art
at the time the invention was made that when a formula is made using probability
weights in a linear equation, the weights are expressed in value intervals of 0-1,
1 being 100% and 0 being 0%. This method is commonly used in statistical
methods.



As to claims 11 and 31, Gao discloses the system of claim 1 wherein said
interpolation coefficients for creating said optimized language model are
selectively chosen by analyzing effects of various combinations of said
interpolation coefficients upon said word-error rates that correspond to
recognition accuracy characteristics of said speech recognizer, said optimized
language model being directly implemented by minimizing said optimal
word-error rate through a selection of said interpolation coefficients (the combination of
language models is done by finding the best combination (optimization of
language performance, column 3 lines 24-30) in order to increase accuracy
rates, column 14 lines 35-45 and column 9 lines 10-22).

As to claims 16 and 36, Gao discloses the system of claim 1 wherein an
interpolation procedure for combining said source models into one of said initial
language models is performed by utilizing a selected initial set of said
interpolation coefficients (column 14 lines 35-45).

5. Claims 6 and 26 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Gao et al. (US 7,275,029) in view of Bellegarda (US 6,778,952) as applied to claim 1 
and in further view of Newman et al (US 6,151,575). 

As to claims 6 and 26, Gao or Bellegarda do not disclose specifically the 
system of claim 1 wherein each of said source models corresponds to a different
application domain that is related to a particular speech environment. Newman
teaches combining an initial and source language model (abstract lines 1-5) and 
the different models corresponding to different application domain that is related 
to a particular speech environment such as particular speakers or groups of 
related speakers (column 2 lines 44-47). 

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
models of different domains as taught by Newman. Doing so would have 
allowed for the system to be diversified enough to have accurate speech 
recognition for different areas and domains of speech (column 2 lines 44-47). 

5. Claims 2, 4, 12, 13, 17-20, 22, 24, 32, 33, and 37-40 are rejected under 35
U.S.C. 103(a) as being unpatentable over Gao et al. (US 7,275,029) in view of
Bellegarda (US 6,778,952) as applied to claim 1 and in further view of Mahajan et al.
(US 6,418,431).

As to claims 2 and 22, Gao or Bellegarda do not disclose specifically the 
system of claim 1 wherein said word-error rates are calculated by comparing a 
correct transcription of said input development data and a top recognition 
candidate from an N-best list that is rescored by a rescoring module for each of 
said initial language models. Mahajan teaches building language models 
(abstract lines 1-4) and combining features from different language models into
one (abstract lines 9-14) and calculating N-best list and rescoring and combining
(column 7 lines 15-22).

It would have been obvious to one having ordinary skill in the art at the
time the invention was made to have modified the method of Gao with the use of
N-best lists as taught by Mahajan. Doing so would have allowed optimizing the
language model by using the N-best lists as an indicator of how much to rely on a
particular language model.
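
The N-best rescoring recited in claims 2 and 22 can be illustrated roughly as follows. This is a sketch under stated assumptions, not the method of Mahajan or of the application: each candidate is assumed to be a (text, recognizer score) pair with higher scores being better, the language-model score is assumed to be added to the recognizer score with a weight, and the names rescore_nbest, lm_score, and lm_weight are hypothetical.

```python
from typing import Callable, List, Tuple

# A recognition candidate: (recognition result as text, recognition score).
Candidate = Tuple[str, float]


def rescore_nbest(nbest: List[Candidate],
                  lm_score: Callable[[str], float],
                  lm_weight: float = 1.0) -> List[Candidate]:
    """Add a weighted language-model score to each candidate and re-rank."""
    rescored = [(text, score + lm_weight * lm_score(text))
                for text, score in nbest]
    # Highest combined score first, so rescored[0] is the top candidate.
    return sorted(rescored, key=lambda cand: cand[1], reverse=True)


# Hypothetical log-domain scores (assumed data, for illustration only).
nbest = [("wreck a nice beach", -9.5), ("recognize speech", -10.0)]
toy_lm = {"wreck a nice beach": -6.0, "recognize speech": -2.0}
rescored = rescore_nbest(nbest, lambda text: toy_lm[text])
top_candidate = rescored[0]
```

The top candidate after rescoring is the one that would be compared against the correct transcription to obtain a word-error rate for the language model under test.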

As to claims 4 and 24, Gao or Bellegarda do not disclose the system of
claim 1 wherein said input development data includes a pre-defined series of
word sequences from which said recognizer rescores a corresponding N-best list 
for calculating said word-error rates. Mahajan teaches building language models 
(abstract lines 1-4) and combining features from different language models into 
one (abstract lines 9-14) and calculating N-best list and rescoring and combining 
(column 7 lines 15-22). 

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
N-best lists as taught by Mahajan. Doing so would have allowed optimizing the 
language model by using the N-best lists as an indicator of how much to rely on a 
particular language model.

As to claims 12 and 32, Gao or Bellegarda do not disclose specifically the
system of claim 1 wherein a rescoring module repeatedly processes said input 
development data to rescore an N-best list of recognition candidates for
calculating said word-error rates by comparing a top recognition candidate to
said input development data, said recognition candidates each including a
recognition result in a text format, and a corresponding recognition score.
Mahajan teaches building language models (abstract lines 1-4) and combining
features from different language models into one (abstract lines 9-14) and
calculating N-best list and rescoring and combining (column 7 lines 15-22).

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
N-best lists as taught by Mahajan. Doing so would have allowed optimizing the 
language model by using the N-best lists as an indicator of how much to rely on a 
particular language model. 

As to claims 13 and 33, Gao or Bellegarda do not disclose specifically the 
system of claim 1 wherein each of said word-error rates are calculated by 
comparing a correct transcription of said input development data and a top 
recognition candidate from an N-best list of recognition candidates provided by 
said speech recognizer after processing said input development data, said top 
recognition candidate corresponding to a best recognition score from said 
speech recognizer. Mahajan teaches building language models (abstract lines
1-4) and combining features from different language models into one (abstract lines
9-14) and calculating N-best list and rescoring and combining (column 7 lines
15-22).

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
N-best lists as taught by Mahajan. Doing so would have allowed optimizing the 
language model by using the N-best lists as an indicator of how much to rely on a 
particular language model. 

As to claims 17 and 37, Gao or Bellegarda do not disclose specifically the 
system of claim 16 wherein a rescoring module rescores an N-best list of 
recognition candidates after utilizing said one of said initial language models to 
perform a recognition procedure upon said input development data. Mahajan 
teaches building language models (abstract lines 1-4) and combining features 
from different language models into one (abstract lines 9-14) and calculating N- 
best list and rescoring and combining (column 7 lines 15-22). 

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
N-best lists as taught by Mahajan. Doing so would have allowed optimizing the 
language model by using the N-best lists as an indicator of how much to rely on a 
particular language model.

As to claims 18 and 38, Gao or Bellegarda do not disclose specifically the
system of claim 17 wherein one of said word-error rates corresponding to said 
one of said initial language models is calculated and stored based upon a 
comparison between a correct transcription of said input development data and a 
top recognition candidate from said N-best list. Mahajan teaches building 
language models (abstract lines 1-4) and combining features from different 
language models into one (abstract lines 9-14) and calculating N-best list and
rescoring and combining (column 7 lines 15-22).

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
N-best lists as taught by Mahajan. Doing so would have allowed optimizing the 
language model by using the N-best lists as an indicator of how much to rely on a 
particular language model. 

As to claims 19 and 39, Gao discloses the system of claim 18 wherein 
said selected initial set of said interpolation coefficients are each iteratively 
altered by a pre-defined amount to produce subsequent sets of said interpolation 
coefficients (column 14 lines 35-45). 

As to claims 20 and 40, Gao discloses the system of claim 19 wherein 
subsequent initial language models are created by utilizing said subsequent sets 
of interpolation coefficients (column 14 lines 35-45). 



Gao or Bellegarda do not disclose specifically a rescoring module
iteratively utilizing said subsequent initial language models to rescore said N-best 
list for calculating subsequent word-error rates, said optimized language model 
being selected by identifying said optimal word-error rate when a pre-determined 
number of said subsequent word-error rates have been calculated. Mahajan 
teaches building language models (abstract lines 1-4) and combining features 
from different language models into one (abstract lines 9-14) and calculating N- 
best list and rescoring and combining (column 7 lines 15-22). 

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
N-best lists as taught by Mahajan. Doing so would have allowed optimizing the 
language model by using the N-best lists as an indicator of how much to rely on a 
particular language model. 
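
The iterative selection recited in claims 19 and 20 (altering the interpolation coefficients by a pre-defined amount, building a language model for each set, and keeping the set with the lowest word-error rate) can be sketched as a simple grid search. This is only an illustration under assumptions: the step size is 1/steps, every coefficient set sums to 1, and wer_for is a hypothetical callback standing in for the whole build-model, rescore, and score cycle. None of these names come from the cited references.

```python
from itertools import product
from typing import Callable, Iterator, Tuple


def coefficient_sets(n_models: int, steps: int) -> Iterator[Tuple[float, ...]]:
    """Yield every coefficient set on a grid of 1/steps whose sum is 1."""
    for counts in product(range(steps + 1), repeat=n_models - 1):
        remainder = steps - sum(counts)
        if remainder >= 0:
            # The last coefficient takes whatever weight is left over.
            yield tuple(c / steps for c in counts) + (remainder / steps,)


def select_optimized(n_models: int, steps: int,
                     wer_for: Callable[[Tuple[float, ...]], float]
                     ) -> Tuple[Tuple[float, ...], float]:
    """Return the coefficient set with the lowest word-error rate, and that rate."""
    best = min(coefficient_sets(n_models, steps), key=wer_for)
    return best, wer_for(best)


# Hypothetical stand-in for "interpolate, rescore the N-best list, compute WER".
def toy_wer(coefficients: Tuple[float, ...]) -> float:
    return abs(coefficients[0] - 0.25) + 0.1


best_set, best_wer = select_optimized(n_models=2, steps=4, wer_for=toy_wer)
```

An exhaustive grid is the simplest reading of "altered by a pre-defined amount"; the number of sets grows quickly with the number of source models, so a practical system might stop after a pre-determined number of evaluations instead.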

5. Claims 14, 15, 34, and 35 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Gao et al. (US 7,275,029) in view of Bellegarda (US 6,778,952) as 
applied to claim 1 and in further view of Deligne et al. (US PGPUB 2004/0199385). 

As to claims 14 and 34, Gao or Bellegarda do not disclose specifically the 
system of claim 1 wherein said word-error rates are calculated to include one or 
more substitutions in which a first incorrect word has been substituted for a first 
correct word in a recognition result, said word-error rates also including one or
more deletions in which a second correct word has been deleted from said
recognition result, said word-error rates further including one or more insertions 
in which a second incorrect word has been inserted into said recognition result. 
Deligne teaches techniques for improving speech processing (abstract lines 1-4)
that include a word-error rate calculation (paragraph [0048]) whose formula
includes deletions, insertions, and substitutions (paragraph [0045]).

It would have been obvious to one having ordinary skill in the art at the 
time the invention was made to have modified the method of Gao with the use of 
deletions, insertions, and substitutions as disclosed by Deligne. Doing so would 
have provided a method to measure quality of the language models (paragraph 
[0045] lines 1-3). 

As to claims 15 and 35, Gao or Bellegarda do not disclose the system of 
claim 1 wherein said word-error rates are each calculated according to a formula:
WER = (Subs + Deletes + Inserts) / Total Words in Correct Transcription
where said WER is one of said word-error rates corresponding to one of said initial language
models, said Subs are substitutions in a recognition result, said Deletes are
deletions in said recognition result, said Inserts are insertions in said recognition
result, and said Total Words in Correct Transcription is a total number of words in
a correct transcription of said input development data. Deligne teaches
techniques for improving speech processing (abstract lines 1-4) that include
a word-error rate calculation (paragraph [0048]) whose formula includes
deletions, insertions, and substitutions (paragraphs [0045]-[0046]).

It would have been obvious to one having ordinary skill in the art at the
time the invention was made to have modified the method of Gao with the use of
deletions, insertions, and substitutions as disclosed by Deligne. Doing so would 
have provided a method to measure quality of the language models (paragraph 
[0045] lines 1-3). 
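
The word-error rate formula recited in claims 15 and 35 can be worked through with a short sketch. It assumes the substitutions, deletions, and insertions are counted from a minimum-edit word alignment between the correct transcription and the recognition result, which is a common convention but is not stated in the Office action; the function names and the sample sentences are hypothetical.

```python
from typing import List, Tuple


def count_errors(reference: List[str], hypothesis: List[str]) -> Tuple[int, int, int]:
    """Return (subs, deletes, inserts) for one minimum-edit word alignment."""
    rows, cols = len(reference) + 1, len(hypothesis) + 1
    # best[i][j] = (total edits, subs, deletes, inserts) when aligning the
    # first i reference words with the first j hypothesis words.
    best = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        best[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, cols):
        best[0][j] = (j, 0, 0, j)  # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            total, subs, dels, ins = best[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (total, subs, dels, ins)          # words match
            else:
                diagonal = (total + 1, subs + 1, dels, ins)  # substitution
            total, subs, dels, ins = best[i - 1][j]
            deletion = (total + 1, subs, dels + 1, ins)
            total, subs, dels, ins = best[i][j - 1]
            insertion = (total + 1, subs, dels, ins + 1)
            best[i][j] = min(diagonal, deletion, insertion, key=lambda c: c[0])
    _, subs, dels, ins = best[-1][-1]
    return subs, dels, ins


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """WER = (Subs + Deletes + Inserts) / total words in the correct transcription."""
    if not reference:
        raise ValueError("the correct transcription must contain at least one word")
    subs, dels, ins = count_errors(reference, hypothesis)
    return (subs + dels + ins) / len(reference)


# Hypothetical example: one substitution (sat -> sit) and one deletion (the).
correct = "the cat sat on the mat".split()
recognized = "the cat sit on mat".split()
errors = count_errors(correct, recognized)
wer = word_error_rate(correct, recognized)
```

In this example the correct transcription has six words and the recognition result contains one substitution and one deletion, so the rate is 2/6.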



Conclusion 

A note has been made to notify the appropriate parties that the examiner 
has moved from Art Unit 2609 to 2626. 

Any inquiry concerning this communication should be directed to Josiah 
Hernandez whose telephone number is 571-270-1646. The examiner can 
normally be reached from 7:30 pm to 5:00 pm. 

If attempts to reach the examiner by telephone are unsuccessful, the
examiner's supervisor, David Hudspeth, can be reached on (571) 272-7843. The
fax phone number for the organization where this application or proceeding is 
assigned is 703-872-9306.

Information regarding the status of an application may be obtained from
the Patent Application Information Retrieval (PAIR) system. Status information 
for published applications may be obtained from either Private PAIR or Public
PAIR. Status information for unpublished applications is available through
Private PAIR only. For more information about the PAIR system, see
http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).



JH 




DAVID HUDSPETH 
SUPERVISORY PATENT EXAMINER 

TECHNOLOGY CENTER 2600 



